Fix aggte failure after pickle reload due to non-numeric column (Issue #71) by gsaco · Pull Request #73 · d2cml-ai/csdid

gsaco · 2026-01-13T02:12:24Z

Resolution of Issue #71

This pull request fixes Issue #71, where aggte() fails after reloading a csdid result from a pickle file. The attached notebook provides a minimal replication of the failure and demonstrates the corrected behavior.

aggte_rowid_jupytext.ipynb

Root cause:

After reloading a saved csdid object, the internal data frame used in aggte_fnc/compute_aggte.py may contain a non-numeric column named rowid. During aggregation, the code groups the data and applies mean() across remaining columns, causing pandas to raise:

TypeError: agg function failed [how->mean, dtype->object]
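
The failure mode can be illustrated outside csdid with a small stand-in frame. This is only a sketch: the column names `group` and `att` and the helper `try_group_mean` are hypothetical, and only `rowid` comes from the description above. It assumes a recent pandas release, where a grouped mean over an object column raises instead of silently dropping the column as older releases did.

```python
import pandas as pd

# Hypothetical stand-in for the internal frame after a pickle reload.
# Only the `rowid` name comes from the PR; `group` and `att` are illustrative.
data = pd.DataFrame({
    "group": [1, 1, 2, 2],
    "att": [0.5, 1.5, 2.0, 4.0],
    "rowid": pd.Series(["a", "b", "c", "d"], dtype=object),  # non-numeric
})


def try_group_mean(frame):
    """Return (result, error) for a grouped mean over all remaining columns."""
    try:
        return frame.groupby("group").mean(), None
    except TypeError as exc:
        # Recent pandas releases refuse to average an object column.
        return None, exc


result, error = try_group_mean(data)
if error is not None:
    print(f"{type(error).__name__}: {error}")
```

On older pandas releases the same call may instead succeed and omit `rowid` from the result, which would be consistent with the failure only appearing in some environments.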

Fix:

Before the aggregation step, the fix removes the rowid column if present:

    if 'rowid' in data.columns:
        data = data.drop(columns=['rowid'])

This restores correct execution of aggte() for both freshly computed and reloaded csdid objects, without affecting existing results.
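
A minimal sketch of why the guard is safe in both cases, assuming a stand-in frame rather than the real csdid internals: the helper name `drop_rowid` and the columns `group` and `att` are hypothetical, and only the `rowid` check mirrors the change in this PR. When `rowid` is absent the guard is a no-op, so both frames go through the same aggregation.

```python
import pandas as pd


def drop_rowid(data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical wrapper around the guard added in this PR."""
    if "rowid" in data.columns:
        # drop() returns a new frame; the caller's frame is left untouched.
        data = data.drop(columns=["rowid"])
    return data


# Illustrative frames: one with the stray column, one without it.
reloaded = pd.DataFrame({
    "group": [1, 1, 2, 2],
    "att": [0.5, 1.5, 2.0, 4.0],
    "rowid": pd.Series(["a", "b", "c", "d"], dtype=object),
})
fresh = reloaded.drop(columns=["rowid"])

# With the guard applied, the grouped mean only sees numeric columns.
means_reloaded = drop_rowid(reloaded).groupby("group").mean()
means_fresh = drop_rowid(fresh).groupby("group").mean()
```

Both calls yield the same per-group means, which matches the claim that existing results are unaffected.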

Update compute_aggte.py

199ec8e

alexanderquispe merged commit ecd128b into d2cml-ai:main Jan 13, 2026
1 check failed

alexanderquispe merged 1 commit into d2cml-ai:main from gsaco:issue71
